IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



367.40127X00 =vl 




Applicant (s) : 



Miska HANNUKSELA, et al 



Serial No. : 



Filed: 



May 15, 2001 



Title: 



VIDEO CODING 



Group: 



LETTER CLAIMING RIGHT OF PRIORITY 



Honorable Commissioner of 



May 15, 2001 



Patents and Trademarks 
Washington, D.C. 20231 

Sir: 

Under the provisions of 35 USC 119 and 37 CFR 1.55, the
applicant(s) hereby claim(s) the right of priority based on
United Kingdom Patent Application No.(s) 0011606.1 filed May
15, 2000.

A certified copy of said United Kingdom Application is 
attached. 



Respectfully submitted. 



ANTONELLI, TERRY, STOUT & KRAUS, LLP 




Carl I. Brundidge 
Registration No. 29,621



CIB/nac 
Attachment 
(703) 312-6600 







The Patent Office
Concept House
Cardiff Road
Newport
South Wales
NP10 8QQ




I, the undersigned, being an officer duly authorised in accordance with Section 74(1) and (4) 
of the Deregulation & Contracting Out Act 1994, to sign and issue certificates on behalf of the 
Comptroller-General, hereby certify that annexed hereto is a true copy of the documents as 
originally filed in connection with the patent application identified therein.



In accordance with the Patents (Companies Re-registration) Rules 1982, if a company named 
in this certificate and any accompanying documents has re-registered under the Companies Act 
1980 with the same name as that with which it was registered immediately before re- 
registration save for the substitution as, or inclusion as, the last part of the name of the words 
"public limited company" or their equivalents in Welsh, references to the name of the company 
in this certificate and any accompanying documents shall be treated as references to the name 
with which it is so re-registered. 



In accordance with the rules, the words "public limited company" may be replaced by p.l.c.,
plc, P.L.C. or PLC.



Re-registration under the Companies Act does not constitute a new legal entity but merely 
subjects the company to certain additional company law rules. 




Dated 



Signed 



CERTIFIED COPY OF 
PRIORITY DOCUMENT 



6/' tttV 2UB1 




An Executive Agency of the Department of Trade and Industry 







Application No: GB 0011606.1 Examiner: Ms Ceri Witchard 

Claims searched: 1-11 Date of search: 4 December 2000 



Patents Act 1977 

Search Report under Section 17 

Databases searched:

UK Patent Office collections, including GB, EP, WO & US patent specifications, in:

UK Cl (Ed.R): H4F FBB FRP FRG FRW

Int Cl (Ed.7): H04N 7/32 7/36
Other: Online: WPI, EPODOC, JAPIO



Documents considered to be relevant: 



Category   Identity of document and relevant passage          Relevant
                                                                to claims

A          EP 0702492 A1 NORTHERN TELECOM LIMITED
           See column 1 line 50 to column 2 line 14.

A          US 4972261 GENERAL ELECTRIC
           See column 1 lines 60-68.





X Document indicating lack of novelty or inventive step
Y Document indicating lack of inventive step if combined
with one or more other documents of same category.

& Member of the same patent family

A Document indicating technological background and/or state of the art
P Document published on or after the declared priority date but before the
filing date of this invention.
E Patent document published on or after, but with priority date earlier
than, the filing date of this application.






Patents Form 1/77 

Patents Act 1977

(Rule 16)



Request for grant of a patent

(See the notes on the back of this form. You can also get an
explanatory leaflet from the Patent Office to help you fill in
this form)



16HAY00 Eo36S92-l D027it 
P01/7700 0.00-0011606,1 



1. Your reference 



The Patent Office

Cardiff Road
Newport
Gwent NP9 1RH



PAT 00407 GB 



2. Patent application number 

(The Patent Office will fill in this part)



0011606.1 



5 MAY 2ni« 



3. Full name, address and postcode of the or of
each applicant (underline all surnames)

Patents ADP number (if you know it)

If the applicant is a corporate body, give the
country/state of its incorporation


NOKIA MOBILE PHONES LIMITED 
KEILALAHDENTIE 4 
02150 ESPOO 
FINLAND 


FINLAND 


4. Title of the Invention 


VIDEO CODING 


5. Name of your agent (if you have one)

"Address for service" in the United Kingdom
to which all correspondence should be sent
(including the postcode)

Patents ADP number (if you know it)


NOKIA IPR DEPARTMENT
NOKIA HOUSE
SUMMIT AVENUE
FARNBOROUGH
HAMPSHIRE

GU14 0NG UK
7577638001


6. If you are declaring priority from one or more
earlier patent applications, give the country
and the date of filing of the or of each of these
earlier applications and (if you know it) the or
each application number


Country     Priority application number     Date of filing
            (if you know it)                (day / month / year)


7. If this application is divided or otherwise 
derived from an earlier UK application, 
give the number and the filing date of 
the earlier application 


Number of earlier application     Date of filing
                                  (day / month / year)


8. Is a statement of inventorship and of right
to grant of a patent required in support of
this request? (Answer 'Yes' if:

a) any applicant named in part 3 is not an inventor, or
b) there is an inventor who is not named as an
applicant, or
c) any named applicant is a corporate body.)


YES 




9. Enter the number of sheets for any of the
following items you are filing with this form.
Do not count copies of the same document

Continuation sheets of this form

THE PATENT OFFICE

16 MAY 2000

RECEIVED BY FAX



Description 

Claims 
Abstract 



10. If you are also filing any of the following, 
state how many against each item. 

Priority documents 

Translations of priority documents 

Statement of inventorship and right
to grant of a patent (Patents Form 7/77)

Request for preliminary examination
and search (Patents Form 9/77)

Request for substantive examination
(Patents Form 10/77)

Any other documents

(Please specify)



11. I/We request the grant of a patent on the basis of this application.

Signature: JULIET HIBBERT     Date: (illegible)



12. Name and daytime telephone number of
person to contact in the United Kingdom

Miss J Hibbert 01252 865101



Warning

After an application for a patent has been filed, the Comptroller of the Patent Office will consider
whether publication or communication of the invention should be prohibited or restricted under
Section 22 of the Patents Act 1977. You will be informed if it is necessary to prohibit or restrict
your invention in this way. Furthermore, if you live in the United Kingdom, Section 23 of the
Patents Act 1977 stops you from applying for a patent abroad without first getting written
permission from the Patent Office unless an application has been filed at least 6 weeks beforehand
in the United Kingdom for a patent for the same invention and either no direction prohibiting
publication or communication has been given, or any such direction has been revoked.

Notes

a) If you need help to fill in this form or you have any questions, please contact the Patent
Office on 0645 500505.
b) Write your answers in capital letters using black ink or you may type them.
c) If there is not enough space for all the relevant details on any part of this form, please
continue on a separate sheet of paper and write "see continuation sheet" in the relevant
part(s). Any continuation sheet should be attached to this form.
d) If you have answered 'Yes', Patents Form 7/77 will need to be filed.
e) Once you have filled in the form you must remember to sign and date it.
f) For details of the fee and ways to pay please contact the Patent Office.



nc297BBc.doc 



VIDEO CODING 



PAT 00407



This invention relates to video coding. 

A video sequence consists of a series of still pictures or frames. Video
compression methods are based on reducing the redundant and perceptually
irrelevant parts of video sequences. The redundancy in video sequences can
be categorised into spectral, spatial and temporal redundancy. Spectral
redundancy refers to the similarity between the different colour components of
the same picture. Spatial redundancy results from the similarity between
neighbouring pixels in a picture. Temporal redundancy exists because objects
appearing in a previous image are also likely to appear in the current image.
Compression can be achieved by taking advantage of this temporal
redundancy and predicting the current picture from another picture, termed an
anchor or reference picture. Further compression is achieved by generating
motion compensation data that describes the motion between the current
picture and the previous picture.

However, sufficient compression cannot usually be achieved by only reducing
the inherent redundancy of the sequence. Thus, video encoders also try to
reduce the quality of those parts of the video sequence which are subjectively
less important. In addition, the redundancy of the encoded bit-stream is
reduced by means of efficient lossless coding of compression parameters and
coefficients. The main technique is to use variable length codes.
Video compression methods typically differentiate between pictures that utilise
temporal redundancy reduction and those that do not. Compressed pictures
that do not utilise temporal redundancy reduction methods are usually called
INTRA or I-frames or I-pictures. Temporally predicted images are usually
forwardly predicted from a picture occurring before the current picture and are
called INTER or P-frames. In the INTER frame case, the predicted motion-
compensated picture is rarely precise enough and therefore a spatially
compressed prediction error frame is associated with each INTER frame.
INTER pictures may contain INTRA-coded areas.



Many video compression schemes also use temporally bi-directionally
predicted frames, which are commonly referred to as B-pictures or B-frames.
B-pictures are inserted between anchor picture pairs of I- and/or P-frames and
are predicted from one or both of these anchor pictures. B-pictures
normally yield increased compression as compared with forward-predicted
pictures. B-pictures are not used as anchor pictures, i.e., other pictures are
not predicted from them. Therefore they can be discarded (intentionally or
unintentionally) without impacting the picture quality of future pictures. Whilst
B-pictures may improve compression performance as compared with P-pictures,
their generation requires greater computational complexity and
memory usage, and they introduce additional delays. This may not be a
problem for non-real time applications such as video streaming but may cause
problems in real-time applications such as video-conferencing.



A compressed video clip typically consists of a sequence of pictures, which
can be roughly categorised into temporally independent INTRA pictures and
temporally differentially coded INTER pictures. Since the compression
efficiency in INTRA pictures is normally lower than in INTER pictures, INTRA
pictures are used sparingly, especially in low bit-rate applications.

A video sequence may consist of a number of scenes or shots. The picture
contents may be remarkably different from one scene to another, and
therefore the first picture of a scene is typically INTRA-coded. There are
frequent scene changes in television and film material, whereas scene cuts
are relatively rare in video conferencing. In addition, INTRA pictures are
typically inserted to stop temporal propagation of transmission errors in a
reconstructed video signal and to provide random access points to a video
bit-stream.
Compressed video is easily corrupted by transmission errors, mainly for two
reasons. Firstly, due to utilisation of temporal predictive differential coding
(INTER frames), an error is propagated both spatially and temporally. In
practice this means that, once an error occurs, it is easily visible to the human
eye for a relatively long time. Especially susceptible are transmissions at low
bit-rates where there are only a few INTRA-coded frames, so temporal error
propagation is not stopped for some time. Secondly, the use of variable
length codes increases the susceptibility to errors. When a bit error alters the
codeword, the decoder will lose codeword synchronisation and also decode
subsequent error-free codewords (comprising several bits) incorrectly until the
next synchronisation (or start) code. A synchronisation code is a bit pattern
which cannot be generated from any legal combination of other codewords
and such codes are added to the bit stream at intervals to enable
re-synchronisation. In addition, errors occur when data is lost during
transmission. For example, in video applications using the unreliable UDP
transport protocol in IP networks, network elements may discard parts of the
encoded video bit-stream.

There are many ways for the receiver to address the corruption introduced in
the transmission path. In general, on receipt of a signal, transmission errors
are first detected and then corrected or concealed by the receiver. Error
correction refers to the process of recovering the erroneous data perfectly as
if no errors had been introduced in the first place. Error concealment refers to
the process of concealing the effects of transmission errors so that they are
hardly visible in the reconstructed video sequence. Typically some amount of
redundancy is added by the source or transport coding in order to help error
detection, correction and concealment.
There are numerous known concealment algorithms, a review of which is
given by Y. Wang and Q.-F. Zhu in "Error Control and Concealment for Video
Communication: A Review", Proceedings of the IEEE, Vol. 86, No. 5, May
1998, pp. 974 - 997 and an article by P. Salama, N. B. Shroff, and E. J. Delp,
"Error Concealment in Encoded Video," submitted to IEEE Journal on
Selected Areas in Communications.

Current video coding standards define a syntax for a self-sufficient video
bit-stream. The most popular standards at the time of writing are ITU-T
Recommendation H.263, "Video coding for low bit rate communication",
February 1998; ISO/IEC 14496-2, "Generic Coding of Audio-Visual Objects.
Part 2: Visual", 1999 (known as MPEG-4); and ITU-T Recommendation H.262
(ISO/IEC 13818-2) (known as MPEG-2). These standards define a hierarchy
for bit-streams and correspondingly for image sequences and images.



In H.263, the hierarchy has four layers: picture, picture segment, macroblock,
and block layer. The picture layer data contain parameters affecting the whole
picture area and the decoding of the picture data. Most of this data is
arranged in a so-called picture header.

The picture segment layer can either be a group of blocks layer or a slice
layer. By default, each picture is divided into groups of blocks. A group of
blocks (GOB) typically comprises 16 successive pixel lines. Data for each
GOB consists of an optional GOB header followed by data for macroblocks. If
the optional slice structured mode is used, each picture is divided into slices
instead of GOBs. A slice contains a number of successive macroblocks in
scan-order. Data for each slice consists of a slice header followed by data for
the macroblocks.



Each GOB or slice is divided into macroblocks. A macroblock relates to 16 x
16 pixels (or 2 x 2 blocks) of luminance and the spatially corresponding 8 x 8
pixels (or block) of chrominance components. A block relates to 8 x 8 pixels
of luminance or chrominance. Block layer data consist of uniformly quantised
discrete cosine transform coefficients, which are scanned in zigzag order,
processed with a run-length encoder and coded with variable length codes.
MPEG-2 and MPEG-4 layer hierarchies resemble the one in H.263.
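The zigzag scan and run-length step mentioned above can be illustrated with a minimal Python sketch. It is an illustration only: the function names are hypothetical, and the (run, level) pairs are a simplification that does not reproduce the event format or the variable length code tables of H.263.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def scan_block(block):
    """Flatten a square block of quantised coefficients in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def run_length(coefficients):
    """Encode a coefficient list as (zero_run, level) pairs.

    Trailing zeros produce no pair; the end of the block implies them.
    """
    pairs, run = [], 0
    for value in coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 12, 5, -3
print(run_length(scan_block(block)))
```

Because quantisation leaves most high-frequency coefficients at zero, the zigzag order groups the zeros at the end of the scan, which is what makes the run-length step effective.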

By default, these standards use the temporally previous anchor (I, EI, P, or
EP) picture as a reference for temporal prediction. This piece of information is
not transmitted, i.e. the bit-stream does not contain information relating to the
identity of the reference picture. Consequently, decoders have no means to
detect if a reference picture is lost. Many transport coders packetise video data
in such a way that they associate a sequence number with the packets.
However, these kinds of sequence numbers are not related to the video
bit-stream. For example, a section of a video bit-stream may contain the data for
P-picture P1, B-picture B2, P-picture P3, and P-picture P4 captured (and to be
displayed) in this order. However, this section of the video bit-stream would be
compressed, transmitted, and decoded in the following order: P1, P3, B2, P4
since B2 requires both P1 and P3 before it can be encoded or decoded. Let
us assume that there is one packet per one picture and each packet contains
a sequence number. Let us further assume that the packet carrying B2 is lost.
The receiver can detect this loss from the packet sequence numbers.
However, the receiver has no means to detect if it has lost a motion
compensation reference picture for P4 or if it has lost a B-picture, in which
case it could continue decoding normally.
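The reordering and the resulting ambiguity can be sketched in a few lines of Python. This is an illustrative model only: pictures are represented by labels such as "P1" or "B2", and the helper names are hypothetical.

```python
def coding_order(display_order):
    """Reorder pictures so that each B-picture follows both of its anchors."""
    out, pending_b = [], []
    for picture in display_order:
        if picture.startswith("B"):
            pending_b.append(picture)  # must wait for the next anchor picture
        else:
            out.append(picture)        # anchor (I or P) is coded first
            out.extend(pending_b)      # then the B-pictures it completes
            pending_b = []
    return out + pending_b


def received_sequence_numbers(packet_count, lost_index):
    """Sequence numbers seen by the receiver, one packet per picture."""
    return [n for n in range(packet_count) if n != lost_index]


stream_with_b = coding_order(["P1", "B2", "P3", "P4"])  # P1, P3, B2, P4
stream_all_p = coding_order(["P1", "P2", "P3", "P4"])   # P1, P2, P3, P4

# In both streams the third packet (sequence number 2) is lost.
seen_with_b = received_sequence_numbers(len(stream_with_b), 2)  # B2 lost
seen_all_p = received_sequence_numbers(len(stream_all_p), 2)    # P3 lost
print(seen_with_b == seen_all_p)
```

The receiver observes the same gap in both cases, although in the first stream only a disposable B-picture was lost and in the second the reference picture for P4 was lost.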



The decoder therefore usually sends an INTRA request to the transmitter and
freezes the picture on the display. However the transmitter may not be able
to respond to this request. For instance in a non-real-time video streaming
application, the transmitter cannot respond to an INTRA request from a
decoder. Therefore the decoder freezes the picture until the next INTRA
frame is received. In a real time application such as video-conferencing, the
transmitter may not be able to respond. For instance, in a multi-party
conference, the encoder may not be able to respond to individual requests.
Again the decoder freezes the picture until an INTRA frame is output by the
transmitter.



According to a first aspect of the invention there is provided a method of
encoding a video signal representing a sequence of pictures, the method
comprising receiving a current picture for encoding, forming a temporal
prediction of the current picture from a default reference picture for the current
picture, comparing the default reference picture with at least one further
reference picture, calculating a measure of the similarity between the default
reference picture and each further reference picture and, if the measure of
similarity meets a pre-determined criterion, outputting an indicator identifying
the further reference picture and associating the indicator with the temporal
prediction of the current frame.


The indicator may be termed a spare reference picture number since the 
indicator indicates to a decoder which reference picture(s) resemble the 
default reference picture. This "spare" reference picture may be used by a 
decoder to decode the current frame if the default reference picture is lost for 
some reason. 

The spare reference picture number may be in respect of the whole picture or
part of a picture. In the former case, typically the spare reference picture
number is included in a picture header. In the latter case the spare reference
picture number is included in the picture segment headers or macroblock
headers of the picture. In a preferred implementation of the invention, the
video signal is encoded according to the H.263 standard and the indicator is
included in the Supplemental Enhancement Information.

Preferably the method also comprises forming a temporal prediction of the
current picture from a first default reference picture and a second default
reference picture for the current picture, said first default reference picture
occurring temporally before the current picture and said second default
reference picture occurring temporally after the current picture, comparing the
first default reference picture with at least one further reference picture
occurring temporally before the current picture, calculating a measure of the
similarity between the first default reference picture and each further
reference picture and, if the measure of similarity meets a pre-determined
criterion, outputting an indicator identifying the further reference picture and
associating the indicator with the temporal prediction of the current frame.



Thus an indicator is provided for forwardly predicted frames but not for 
backwardly predicted frames. 

Preferably the default reference picture is compared with a plurality of further
reference pictures and an indicator is output for each further reference picture
that meets the predetermined criterion. Advantageously the further reference
pictures that meet the predetermined criterion are ranked in order of similarity
and the indicator is associated with the temporal prediction of the current
frame in order of rank, the further reference picture having the closest
similarity to the default reference picture being placed first.
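The comparison, criterion and ranking described above might be sketched as follows. This is a hedged illustration, not the method of the specification: the similarity measure (sum of absolute differences over pixel values), the threshold form of the criterion and all function names are assumptions made for the example.

```python
def sum_of_absolute_differences(picture_a, picture_b):
    """Assumed similarity measure: a lower value means more similar pictures."""
    return sum(abs(a - b) for a, b in zip(picture_a, picture_b))


def spare_reference_numbers(default_reference, candidates, threshold):
    """Return the picture numbers of suitable spare reference pictures.

    candidates maps a picture number to its pixel values (a flat list).
    A candidate qualifies when its difference from the default reference
    picture does not exceed the threshold (assumed criterion). The result
    is ranked with the most similar picture first.
    """
    scored = [
        (sum_of_absolute_differences(default_reference, pixels), number)
        for number, pixels in candidates.items()
    ]
    return [number for score, number in sorted(scored) if score <= threshold]


default_reference = [10, 10, 10, 10]
candidates = {
    1: [50, 50, 50, 50],  # very different, fails the criterion
    2: [12, 12, 12, 12],  # difference 8
    3: [10, 11, 10, 10],  # difference 1, the closest match
}
print(spare_reference_numbers(default_reference, candidates, threshold=10))
```

An encoder following this sketch would associate the returned picture numbers, in that order, with the temporal prediction of the current frame.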
According to a second aspect of the invention there is provided a method of
decoding an encoded video signal representing a sequence of pictures, the
encoded signal including pictures that have been encoded by forming a
temporal prediction of a current picture from a default reference picture for the
current picture, the method comprising receiving an encoded video signal
representing a current picture, decoding at least the picture header of the
current picture, and applying error correction and error concealment methods
as appropriate, wherein if the reference picture of the current picture is lost,
the decoder examines an indicator identifying a further reference picture and
decodes the current picture with reference to said further reference picture if
such an indicator is associated with the current picture.
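The decoder-side behaviour of the second aspect can be sketched as a simple lookup. The sketch assumes that decoded pictures are held in a store keyed by picture number and that the indicators arrive as a ranked list; the function name and the data structures are hypothetical.

```python
def choose_reference(default_number, spare_numbers, picture_store):
    """Pick the reference picture used to decode the current picture.

    Returns the default reference picture number if that picture is
    available, otherwise the first available spare reference picture,
    otherwise None (the caller then conceals the error or requests an
    INTRA picture).
    """
    if default_number in picture_store:
        return default_number
    for number in spare_numbers:  # ranked, most similar first
        if number in picture_store:
            return number
    return None


picture_store = {4: "decoded picture 4", 7: "decoded picture 7"}
# Picture 5 was lost in transmission; the encoder signalled 7 and 4 as spares.
print(choose_reference(5, [7, 4], picture_store))
```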
According to a third aspect of the invention there is provided a video encoder
comprising an input for receiving a video signal representing a sequence of
pictures, an input for receiving a current picture for encoding, means for
forming a temporal prediction of the current picture from a default reference
picture for the current picture, means for comparing the default reference
picture with at least one further reference picture, means for calculating a
measure of the similarity between the default reference picture and each
further reference picture and, when the measure of similarity meets a
pre-determined criterion, outputting an indicator identifying the further reference
picture and associating the indicator with the temporal prediction of the
current frame.

According to a fourth aspect of the invention there is provided a video decoder
comprising an input for receiving an encoded video signal representing a
sequence of pictures, the encoded signal including pictures that have been
encoded by forming a temporal prediction of a current picture from a default
reference picture for the current picture, the decoder comprising an input for
receiving an encoded video signal representing a current picture, means for
decoding at least the picture header of the current picture, and applying error
correction and error concealment methods as appropriate, wherein when the
reference picture of the current picture is lost, the decoder is arranged to
examine an indicator identifying a further reference picture and to decode the
current picture with reference to said further reference picture if such an
indicator is associated with the current picture.

The invention also relates to a radio telecommunications device including an
encoder and/or a decoder as described.

The invention will now be described, by way of example only, with reference
to the accompanying drawings, in which:

Figure 1 shows a multimedia mobile communications system;

Figure 2 shows an example of the multimedia components of a multimedia
terminal;

Figure 3 shows an example of a video codec;

Figure 3a shows a more detailed view of a video encoder according to the
invention;

Figure 4 illustrates the operation of a first embodiment of a video encoder
according to the invention;

Figure 5 illustrates the operation of a second implementation of a video
encoder according to the invention;

Figure 6 shows the syntax of a bit stream as known according to H.263;

Figure 7 shows a first example of a bit stream output by an encoder according
to the invention;

Figure 8 shows a second example of a bit stream output by an encoder
according to the invention;

Figure 9 shows a third example of a bit stream output by an encoder
according to the invention;

Figure 10 illustrates enhancement layers used in video coding for SNR
scalability; and

Figure 11 illustrates enhancement layers used in video coding for spatial
scalability.



Figure 1 shows a typical multimedia mobile communications system. A first
multimedia mobile terminal 1 communicates with a second multimedia mobile
terminal 2 via a radio link 3 to a mobile communications network 4. Control
data is sent between the two terminals 1, 2 as well as the multimedia data.

Figure 2 shows the typical multimedia components of a terminal 1. The
terminal comprises a video codec 10, an audio codec 20, a data protocol
manager 30, a control manager 40, a multiplexer/demultiplexer 50 and a
modem 60 (if required). The video codec 10 receives signals for coding
from a video capture device of the terminal (not shown) (e.g. a camera) and
receives signals for decoding from a remote terminal 2 for display by the
terminal 1 on a display 70. The audio codec 20 receives signals for coding
from the microphone (not shown) of the terminal 1 and receives signals for
decoding from a remote terminal 2 for reproduction by a speaker (not shown)
of the terminal 1. The terminal may be a portable radio communications
device, such as a radio telephone.


The control manager 40 controls the operation of the video codec 10, the
audio codec 20 and the data protocol manager 30. However, since the
invention is concerned with the operation of the video codec 10, no further
discussion of the audio codec 20 and protocol manager 30 will be provided.

Figure 3 shows an example of a video codec 10 according to the invention.
The video codec comprises an encoder part 100 and a decoder part 200.
The encoder part 100 comprises an input 101 for receiving a video signal from
a camera or video source (not shown) of the terminal 1. A switch 102
switches the encoder between an INTRA-mode of coding and an INTER-mode.
The encoder part 100 of the video codec 10 comprises a DCT
transformer 103, a quantiser 104, an inverse quantiser 108, an inverse DCT
transformer 109, an adder 110, a plurality of picture stores 107 (see Figure 3a
for more detail), a subtractor 106 for forming a prediction error, a switch 113
and an encoding control manager 105.

The decoder part 200 of the video codec 10 comprises an inverse quantiser
120, an inverse DCT transformer 121, a motion compensator 122, a plurality
of picture stores 123 and a controller 124. The controller 124 receives video
codec control signals demultiplexed from the encoded multimedia stream by
the demultiplexer 50. In practice the controller 105 of the encoder and the
controller 124 of the decoder may be the same processor.

The operation of an encoder according to the invention will now be described.



The video codec 10 receives a video signal to be encoded. The encoder 100
of the video codec encodes the video signal by performing DCT
transformation, quantisation and motion compensation. The encoded video
data is then output to the multiplexer 50. The multiplexer 50 multiplexes the
video data from the video codec 10 and control data from the control manager 40
(as well as other signals as appropriate) into a multimedia signal. The terminal 1
outputs this multimedia signal to the receiving terminal 2 via the modem 60 (if
required).

In INTRA-mode, the video signal from the input 101 is transformed to DCT
coefficients by a DCT transformer 103. The DCT coefficients are then passed to
the quantiser 104 that quantises the coefficients. Both the switch 102 and the
quantiser 104 are controlled by the encoding control manager 105 of the
video codec, which may also receive feedback control from the receiving
terminal 2 by means of the control manager 40. A decoded picture is then
formed by passing the data output by the quantiser through the inverse
quantiser 108 and applying an inverse DCT transform 109 to the
inverse-quantised data. The resulting data is added to the contents of the picture
store 107 by the adder 110.

In INTER mode, the switch 102 is operated to accept from the subtractor 106
the difference between the signal from the input 101 and a reference picture
which is stored in a picture store 107. The difference data output from the
subtractor 106 represents the prediction error between the current picture and
the reference picture stored in the picture store 107. A motion estimator 111
may generate motion compensation data from the data in the picture store
107 in a conventional manner.
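The INTER-mode signal path described above (subtractor, quantiser, inverse quantiser, adder) can be sketched in a few lines of Python. The sketch is deliberately simplified and rests on stated assumptions: the DCT, inverse DCT and motion compensation stages are omitted, pictures are flat lists of pixel values, a single uniform quantiser step is used, and the function names are hypothetical.

```python
def encode_inter(current, reference, step):
    """Quantise the prediction error and rebuild the picture locally.

    Returns (levels, reconstruction). The reconstruction is what would be
    written to the picture store so that encoder and decoder predict
    from identical reference data.
    """
    error = [c - r for c, r in zip(current, reference)]   # subtractor 106
    levels = [round(e / step) for e in error]             # quantiser 104
    dequantised = [q * step for q in levels]              # inverse quantiser 108
    reconstruction = [r + d for r, d in zip(reference, dequantised)]  # adder 110
    return levels, reconstruction


def decode_inter(levels, reference, step):
    """Decoder side: rebuild the picture from the levels and the reference."""
    return [r + q * step for r, q in zip(reference, levels)]


current = [100, 104, 97, 120]
reference = [98, 100, 100, 110]
levels, reconstruction = encode_inter(current, reference, step=3)
print(levels, reconstruction)
print(decode_inter(levels, reference, step=3) == reconstruction)
```

The decoder reproduces the encoder's reconstruction exactly only when it holds the same reference picture, which is why the loss of a reference picture matters so much.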



The encoding control manager 105 decides whether to apply INTRA or INTER
coding or whether to code the frame at all on the basis of either the output of
the subtractor 106 or in response to feedback control data from a receiving
decoder. The encoding control manager may decide not to code a received
frame at all when the similarity between the current frame and the reference
frame is sufficiently high or when there is not time to code the frame. The
encoding control manager operates the switch 102 accordingly.



When not responding to feedback control data, the encoder typically encodes
a frame as an INTRA-frame either only at the start of coding (all other frames
being P-frames), or at regular periods e.g. every 5s, or when the output of the
subtractor exceeds a threshold i.e. when the current picture and that stored in
the picture store 107 are judged to be too dissimilar. The encoder may also
be programmed to encode frames in a particular regular sequence e.g. I B B
P B B P B B P B B P B B I B B P etc.
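The decision made by the encoding control manager can be sketched as a small
function. This is an illustrative sketch, not the method of the specification:
the thresholds, the time_available flag and the name choose_mode are
hypothetical, and the prediction error is assumed to be summarised as a single
number (for example a mean absolute difference).

```python
from enum import Enum


class Mode(Enum):
    INTRA = "INTRA"
    INTER = "INTER"
    SKIP = "SKIP"


def choose_mode(prediction_error, intra_threshold, skip_threshold,
                intra_requested=False, time_available=True):
    """Pick a coding mode for the current frame (hypothetical policy)."""
    if not time_available:
        return Mode.SKIP              # no time to code the frame
    if intra_requested:
        return Mode.INTRA             # feedback from the receiving decoder
    if prediction_error <= skip_threshold:
        return Mode.SKIP              # frame is nearly identical to the reference
    if prediction_error > intra_threshold:
        return Mode.INTRA             # pictures judged too dissimilar
    return Mode.INTER


mode = choose_mode(prediction_error=10.0, intra_threshold=50.0, skip_threshold=1.0)
```

A periodic INTRA refresh (for example every 5 s) could be layered on top by
setting intra_requested from a timer.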



The video codec outputs the quantised DCT coefficients 112a, the quantising
index 112b (i.e. the details of the quantising used), an INTRA/INTER flag
112c to indicate the mode of coding performed (I or P/B), a transmit flag 112d
to indicate the number of the frame being coded and the motion vectors 112e
for the picture being coded. These are multiplexed together by the multiplexer
50 together with other multimedia signals.

The encoder 100 will be described further with reference to Figure 3a, which
shows a simplified view of the encoder 100 of the video codec. The encoder
100 comprises a plurality of picture stores 107a-107g. Although in this
example seven picture stores are shown, the number of picture stores may be
two or more.

Consider an encoder that is arranged to encode an input signal with the
format I B B P B B P B B P B B P B B P B B P B B I etc. For simplicity we will
assume that the encoder will encode every frame of the input signal i.e. no
frames will be skipped. This is illustrated in Figure 4.



As mentioned earlier, the frames are received from a video capture device in
the order 0, 1, 2, 3, 4, 5, 6 etc. and are displayed in this order i.e. the decoded
frames are displayed in the order I0, B1, B2, P3, B4, B5, P6 etc. However the
video bit stream is compressed, transmitted and decoded in the following
order: I0, P3, B1, B2, P6, B4, B5 etc. This is because each B-frame requires
preceding and succeeding reference frames before it can be
encoded/decoded i.e. frame B1 requires frames I0 and P3 to be
encoded/decoded before it can be encoded/decoded.
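The reordering from display order to transmission order can be illustrated as
follows. This is a small sketch under the assumption that every B-frame is
held back until the next reference frame (I or P) has been coded; the function
name coding_order is hypothetical.

```python
def coding_order(frame_types):
    """Return display indices in the order the frames are coded.

    frame_types lists the picture type ('I', 'P' or 'B') of each frame
    in display order.
    """
    order = []
    pending_b = []                    # B-frames waiting for their later reference
    for index, kind in enumerate(frame_types):
        if kind == "B":
            pending_b.append(index)
        else:
            order.append(index)       # code the reference frame first
            order.extend(pending_b)   # then the B-frames that depend on it
            pending_b = []
    order.extend(pending_b)           # trailing B-frames with no later reference
    return order


print(coding_order(list("IBBPBBP")))
```

For the pattern I B B P B B P this yields the indices 0, 3, 1, 2, 6, 4, 5,
which matches the order I0, P3, B1, B2, P6, B4, B5 given in the text.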



When the first frame is received, all of the picture stores 107 are empty and
the switch 102 is placed into the INTRA mode under control of the encoding
controller 105 so that the input signal is encoded in INTRA format. The input
signal is DCT transformed and quantised. This is done on a macroblock by
macroblock basis. The resulting signal is then decoded by inverse quantiser
108 and inverse DCT 109. Since the frame is INTRA coded, switch 113 is
open. The output of adder 110 is input to the first picture store 107a. For this
purpose switch 114a is closed whereas switches 114b-g are open. Thus
frame store 107a holds a decoded version of reference picture I0.



The next picture to be coded is frame 3, which is to be forwardly predicted
from I0. Therefore when frame 3 is input at 101, switch 102 is changed to the
INTER mode, the output switch 115 of the most recent reference picture store
(i.e. switch 115a) is closed and the motion compensated contents of picture
store 107a are subtracted from the input signal, motion compensation data
having been calculated in the conventional manner. This prediction error is
then encoded by DCT 103 and quantiser 104 and decoded by inverse
quantiser 108 and IDCT 109. The switch 113 is then closed, switch 115a
closed and switch 114b closed (the other switches 114 and 115 being open).
Thus adder 110 adds the decoded picture to the picture as stored in picture
store 107a and stores the result in picture store 107b.

The next frame to be coded is frame 2, which is to be coded as a B-frame.
Thus the contents of both of the frame stores 107a and 107b are made
available to the subtractor 106 in a conventional manner. Since B-frames do
not form a reference picture for any other frame, the encoded B-frame is not
decoded and stored in a reference picture store.

Thus in the case described above, after 19 frames, the frame stores 107a to
107g contain decoded versions of frames I0, P3, P6, P9, P12, P15 and P18
respectively.

In the invention, when the encoder encodes a frame in a predictive manner,
the encoding control manager 105 may associate with the frame a Spare
Reference Picture Number (SRPN). For example, a SRPN may be associated
with the P and B frames of a video signal but not with the I-frames.

Encoders can use this message to instruct decoders which reference picture
or pictures resemble the current reference picture, so that one of them can be
used as a spare reference picture if the actual reference picture is lost during
transmission.

When frame 3 is encoded with reference to frame 0, no other reference
frames are stored in the reference picture stores 107a-g. Therefore no SRPN
is associated with frame 3. Similarly when frames 1 and 2 are bi-directionally
encoded with reference to frames 0 and 3, there are no other frames held in
the reference picture stores 107a-g. Therefore no SRPN is associated with
either of these frames.



However when frame 6 is forwardly predicted from frame 3 (the decoded
version of which is stored in picture store 107b) there is also a copy of frame
I0 in picture store 107a. The encoder calculates the similarity between the
default reference picture of the current frame (i.e. frame 3 for frame 6) and the
contents of the other populated picture stores i.e. picture store 107a. If these
two reference pictures are sufficiently similar (e.g. the correlation between the
contents of frame stores 107a and 107b is above a threshold), the encoder
associates a SRPN with the data for frame 6. The SRPN identifies frame 0
as a spare reference picture. However, if the similarity is not
sufficient, no SRPN is associated with frame 6.



As will be appreciated, when frame 15 is to be encoded (as a P-frame),
decoded versions of frames 0, 3, 6, 9 and 12 are held in picture stores 107a-e
respectively. By default, frame 15 is encoded with reference to frame 12 as
stored in picture store 107e. The encoder also carries out a calculation of the
correlation between the data in the picture store 107e and the data stored in
the other populated picture stores 107a-d. The encoder identifies the picture
store (and hence the reference picture) that has the closest correlation with
the contents of picture store 107e i.e. with the default reference picture for the
current frame being coded. The encoder then adds a SRPN to the encoded
data that indicates the identified reference picture. This SRPN can be equal
to the Temporal Reference of the reference picture as will be described
below.
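The selection of spare reference pictures can be sketched as below. This is
an illustration under stated assumptions, not the encoder of the specification:
the text only requires some measure of correlation, and mean absolute
difference is swapped in here as a simple stand-in; the names mean_abs_diff
and choose_srpn, the max_diff threshold and the sample data are hypothetical.
Picture stores are modelled as a dict keyed by Temporal Reference.

```python
def mean_abs_diff(a, b):
    """Dissimilarity between two pictures (smaller means more similar)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def choose_srpn(stores, default_tr, max_diff):
    """Return the TRs of acceptable spare references, most similar first.

    stores maps a Temporal Reference to a decoded picture; default_tr
    identifies the default reference picture of the frame being coded.
    An empty list means no SRPN is associated with the frame.
    """
    default = stores[default_tr]
    candidates = []
    for tr, picture in stores.items():
        if tr == default_tr:
            continue                      # the default is never its own spare
        diff = mean_abs_diff(default, picture)
        if diff <= max_diff:              # similar enough to act as a spare
            candidates.append((diff, tr))
    return [tr for diff, tr in sorted(candidates)]


stores = {
    0: [10, 10, 10, 10],
    3: [50, 50, 50, 50],
    6: [12, 12, 12, 12],
    9: [12, 10, 11, 10],
    12: [10, 10, 11, 10],
}
spares = choose_srpn(stores, default_tr=12, max_diff=2.0)
```

Returning a list ordered by similarity also covers the case where more than
one SRPN is associated with a frame.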



More than one SRPN may be associated with a frame. In this case, the
SRPN are ordered within the picture header in the order of similarity, the most
similar reference picture (other than the default) being mentioned first.

The encoding control manager 105 outputs this SRPN codeword on output
112f which indicates the Spare Reference Picture Number associated with the
encoded frame. This is multiplexed into the video bitstream by a multiplexer.

Figure 4 illustrates the operation of a first embodiment of the encoder. The
first line of Figure 4 represents the frames of data received from a capture
input device and input to the video coder on input 101. The second line of
Figure 4 represents those frames of the input signal that the encoder decides
to encode and the coding mode used to encode each frame. As mentioned
above, in this example the encoder is arranged to encode every frame and to
use the IBBP coding format.



Frame 0 is coded in INTRA-mode; frame 1 is encoded as a B-frame with
reference to frame 0 and/or 3; frame 2 is encoded as a B-frame with
reference to frame 0 and/or 3; frame 3 is encoded as a P-frame with
reference to frame 0; frame 4 is encoded as a B-frame with reference to
frame 3 and/or 6; frame 5 is encoded as a B-frame with reference to frame 3
and/or 6; frame 6 is encoded as a P-frame with reference to frame 3; etc.

The third line of Figure 4 shows a SRPN field associated with frames of the
encoded signal. In this embodiment a SRPN is associated with the P-frames
and B-frames, as shown in the third line of Figure 4. The P-frames and
B-frames of the encoded frames are temporally predictively encoded and the
I-frames are not.

The fourth line of Figure 4 shows the Temporal Reference (TR) of the
encoded frame. This is a field included in H.263 and the value of TR is
formed by incrementing its value in the temporally previous reference picture
header by one plus the number of skipped or non-reference pictures since the
previously transmitted reference picture. Thus in the example shown in
Figure 4 the TR shown for each frame is the same as the original temporal
order of the frames in the original signal input to 101.

Examples of possible values of SRPN are shown. These values indicate the
TR of the spare reference frame as identified by the encoder as described
above. Although this example shows only one SRPN for each predictively
encoded picture, more than one may be associated with each predictively
encoded picture, as described earlier.
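The Temporal Reference rule stated for the fourth line of Figure 4 reduces to a
one-line calculation. The sketch below assumes an 8-bit TR field that wraps
modulo 256, which is the width of the baseline H.263 TR field; the function
name next_tr and the bits parameter are hypothetical.

```python
def next_tr(previous_tr, skipped_or_non_reference, bits=8):
    """TR of the next reference picture.

    previous_tr is the TR in the temporally previous reference picture
    header; skipped_or_non_reference counts the skipped or non-reference
    pictures since that picture. The result wraps at 2**bits (assumed).
    """
    return (previous_tr + 1 + skipped_or_non_reference) % (1 << bits)


# I0 followed by two B-frames gives P3, then P6, matching Figure 4.
tr_p3 = next_tr(0, 2)
tr_p6 = next_tr(tr_p3, 2)
```

With no skipped frames the TR therefore equals the capture order of the
frame, as the text notes for the example of Figure 4.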



Figure 5 illustrates the operation of a second embodiment of an encoder
according to the invention. In this embodiment, the encoder is arranged to
code the frames according to the regular sequence I B B P B B P B B P B B I
B B P B B P. However, in this embodiment, a SRPN is associated with
forwardly predicted frames (i.e. P-frames) only.

The first line of Figure 5 shows the input frames and the second line shows
the coded frames and their coding mode, I, P or B.



The third line of Figure 5 shows the SRPN associated with P-frames. These
may be generated as discussed above with reference to Figure 3a.

The fourth line of Figure 5 shows the Temporal Reference (TR) of the
encoded frame. As in the example shown in Figure 4, the TR shown for each
frame is the same as the original temporal order of the frames in the original
signal input to 101.

Considering the terminal 1 as receiving coded video data from terminal 2, the
operation of the video codec 10 will now be described with reference to its
decoding role. The terminal 1 receives a multimedia signal from the
transmitting terminal 2. The demultiplexer 50 demultiplexes the multimedia
signal and passes the video data to the video codec 10 and the control data
to the control manager 40. The decoder 200 of the video codec decodes the
encoded video data by inverse quantising, inverse DCT transforming and
motion compensating the data. The controller 124 of the decoder checks the
integrity of the received data and, if an error is detected, attempts to conceal
the error in a manner to be described below. The decoded, corrected and
concealed video data is then stored in one of the picture stores 123 and
output for reproduction on a display 70 of the receiving terminal 1.

Errors in video data may occur at the picture level, the GOB level or the
macroblock level. Error checking may be carried out at any or each of these
levels.

Considering first the signal as shown in Figure 4, when a decoder according
to the invention receives this signal each frame of the signal is decoded in a
conventional manner and then displayed on a display means. The decoded
frame may be error corrected and error concealed in a conventional manner.
Each time a frame is decoded, the decoder examines the TR field to
determine when the frame is to be displayed.

In the case shown in Figure 4 the decoder receives frame 0 and determines
from its picture header that the frame is INTRA-coded. The decoder decodes
frame 0 without reference to any other picture and stores it in picture store
123a. The decoder then receives frame 3 and determines from its picture
header that the frame is INTER-coded as a P-frame. The decoder therefore
decodes frame 3 with reference to the preceding reference frame 0 and
stores it in the next picture store 123b. The decoder then decodes frames 1
and 2 with reference to frames 0 and 3 and so on. These frames are not
stored in the picture stores 123 since, as B-pictures, they are not used as a
reference frame for any other frame.

Let us now assume that the decoder is unable to reconstruct frame 9 (this
could be due to the data being greatly corrupted or being lost altogether) and
that the next frame received by the decoder is frame 7, with TR=7, and
SRPN=0. As frame 9 (one of the default reference pictures for frame 7) was
not decoded by the decoder, the decoder looks for a SRPN in the header of
the received frame for the backward prediction. However, frame 7 does not
include a SRPN in the backward direction. Therefore the decoder is unable to
decode frame 7. This is also the case for frame 8.

The decoder then receives frame 10, which was encoded with reference to
frames 9 and 12. Frame 9 was not decoded by the decoder. However frame
10 has SRPN=6. Therefore the decoder uses the decoded reference frame 6,
stored in picture store 123c, to decode frame 10 in the forward direction,
rather than frame 7. This is also true for frame 11.

The next frame to be received is frame 12, which was encoded with reference
to picture 9 and has SRPN=6. Since frame 9 was not decoded, the decoder
uses the reference picture indicated by SRPN (i.e. frame 6 stored in picture
store 123c) to decode frame 12.
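The decoder behaviour in this walkthrough can be summarised as a reference
lookup with a fallback. The sketch below is illustrative only: picture stores
are modelled as a dict keyed by Temporal Reference, and the name
pick_reference is hypothetical.

```python
def pick_reference(default_tr, srpn_list, picture_stores):
    """Return the TR of the reference picture to decode from, or None.

    default_tr is the default reference of the received frame, srpn_list
    holds its spare reference picture numbers in preference order, and
    picture_stores maps the TR of each decoded reference to its picture.
    """
    if default_tr in picture_stores:
        return default_tr                 # normal case: default reference decoded
    for tr in srpn_list:
        if tr in picture_stores:
            return tr                     # first available spare reference
    return None                           # frame cannot be decoded


# Frames 0, 3 and 6 were decoded; frame 9 was lost in transmission.
stores = {0: "picture 0", 3: "picture 3", 6: "picture 6"}
reference_for_frame_12 = pick_reference(9, [6], stores)
```

A result of None corresponds to the case of frames 7 and 8 above, where no
usable spare reference is signalled and the frame is not decoded.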



The decoder may detect the omission of a reference frame in a number of
ways, for instance information relating to the temporal order of each encoded
frame may be examined. Alternatively, the reference frames of an encoded
signal may be allocated a number in a sequential order as described in a
British patent application filed by the Applicant on even date.

If the decoder has the facility to send control feedback data to the transmitting
video encoder the decoder can send a request to the transmitting video
encoder to encode a frame as an INTRA-frame and so stop the temporal error
propagation that would result from frames 10 and 11 being decoded with
reference to frame 6. The decoder continues to decode the signal in a
conventional manner.

When the decoder receives frame 21, which is an INTRA frame, the decoder
decodes frame 21 without reference to any other frame and stores the
decoded frame in picture store 123. The decoder then decodes frames 19
and 20 with reference to frames 18 and 21. Even though some error may
have been introduced to frame 18 by decoding frame 12 with reference to
frame 6 rather than frame 9, the resulting image should be acceptable and the
displayed picture is not held frozen until an INTRA picture is received. This
may be more acceptable to a viewer.

Considering now the signal as shown in Figure 5, when a decoder according
to the invention receives this signal each frame of the signal is decoded in a
conventional manner and then displayed on a display means. The decoded
frame may be error corrected and error concealed in a conventional manner.
Each time a frame is decoded, the decoder examines the TR field to
determine when the frame is to be displayed.

The decoder receives frame 0, which is an INTRA frame and decodes it
accordingly and stores it in picture store 123a. Let us now assume that the
decoder is unable to reconstruct frame 3 (this could be due to the data being
greatly corrupted or being lost altogether) and the next frame received and
decoded by the decoder is frame 1. Frame 1 is a bi-directional frame
encoded with reference to frames 0 and 3. Since frame 3 is lost, the decoder
is unable to reconstruct frame 1 and similarly frame 2. The fact that B-frames
1 and 2 have been lost is of no consequence to the decoder as the B-frame
does not form a reference picture for any other frame and thus its loss will not
introduce any temporal error propagation. The decoder continues to decode
the signal in a conventional manner.



The next frame received and decoded by the decoder is frame 6. The
decoder knows that the preceding reference picture P3 has been lost
(because it could not decode frame 1 or 2). The decoder therefore examines
the header of the received frame for an SRPN. The decoder determines that
frame 6 has a SRPN=0 and so uses frame 0 in the picture store 123a to
decode frame 6.

If the decoder has the facility to send control feedback data to the transmitting
video encoder the decoder can send a request to the transmitting video
encoder to encode a frame as an INTRA-frame and so stop the temporal error
propagation that would result from subsequent frames being decoded with
reference to frame 6 which was decoded with reference to frame 0 rather than
the default frame 3. However the decoder can continue decoding and does
not freeze the picture on the display whilst it waits for an INTRA-coded frame.

How the spare reference picture number may be included in the encoded
signal will now be addressed with reference to the H.263 video coding
standard.

Figure 6 shows the syntax of a bit stream as known according to H.263. The
following implementations describe the GOB format but it will be clear to a
skilled person that the invention may also be implemented in the slice format.

As mentioned already, the bit stream has four layers: the picture layer, picture
segment layer, macroblock layer and block layer. The picture layer comprises
a picture header followed by data for the Group of Blocks, eventually followed
by any optional end-of-sequence code and stuffing bits.

The prior art H.263 bit stream is formatted as shown in Figure 6. A descriptor
for each part is given below:

PSC       The picture start code (PSC) indicates the start of the picture
TR        The Temporal Reference (TR) is formed by incrementing its
          value in the temporally previous reference picture header by
          one plus the number of skipped or non-referenced pictures
          since the previously transmitted one
PTYPE     Amongst other things, PTYPE includes details of the picture
          coding type i.e. INTRA or INTER
PQUANT    A codeword that indicates the quantiser to be used for the
          picture until updated by any subsequent quantiser information
CPM       A codeword that signals the use of optional continuous
          presence multipoint and video multiplex (CPM) mode
PSBI      Picture Sub-Bit stream Indicator - only present if CPM is set
TRB       Present if the frame is a bi-directionally predicted frame (known
          as a PB-frame)
DBQUANT   Present if a bi-directional frame
PEI       This relates to extra insertion information and is set to "1" to
          indicate the presence of the following optional data fields
          PSUPP and PEI. PSUPP and PEI are together known as
          Supplemental Enhancement Information, which is further
          defined in Annex L of H.263.
GOBS      Is the data for the group of blocks for the current picture
ESTF      A stuffing codeword provided to attain byte alignment before
          EOS
EOS       A codeword indicating the end of the data sequence for the
          picture
PSTUF     A stuffing codeword to allow for byte alignment of the next
          picture start code PSC

The structure as shown in Figure 6 does not include the optional PLUSTYPE
data field. PSBI is only present if indicated by CPM. TRB and DBQUANT are
only present if PTYPE indicates use of a so-called PB frame mode (unless the
PLUSTYPE field is present and the use of DBQUANT is indicated therein).
These issues are addressed in more detail in the H.263 specification.

The following paragraphs outline possible implementations of the bit-stream
output by an encoder according to the first aspect of the invention.

The spare reference picture number may be incorporated into a H.263 bit
stream as follows. Figure 7 shows an example of a bit stream output by an
encoder according to the first implementation of the invention. As shown in
Figure 7, the bit stream includes a further codeword SRPN which is a
codeword indicating the Spare Reference Picture Number. This is inserted by
an encoder according to the invention, as described above.

Alternatively, the SRPN may be included in the Supplemental Enhancement
Information PSUPP (see Annex L of H.263 and Figure 4). The supplemental
information may be present in the bit stream even though the decoder may
not be capable of providing the enhanced capability to use it, or even to
properly interpret it. Simply discarding the supplemental information is
allowable by decoders unless a requirement to provide the requested
capability has been negotiated by external means.

If PEI is set to "1", then 9 bits follow consisting of 8 bits of data (PSUPP) and
then another PEI bit to indicate if a further 9 bits follow and so on.
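The PEI/PSUPP chaining rule can be sketched as a small reader. This is an
illustration under assumptions: the bit stream is modelled as a list of 0/1
integers starting at the first PEI bit, each PSUPP octet is assumed to be sent
most significant bit first, and the input is assumed to be complete (no
truncation handling). The name read_psupp is hypothetical.

```python
def read_psupp(bits):
    """Collect the PSUPP octets that follow a chain of PEI bits.

    Returns (octets, bits_consumed), where bits_consumed includes the
    terminating PEI bit equal to 0.
    """
    bits = list(bits)
    pos = 0
    octets = []
    while bits[pos] == 1:                 # PEI = 1: a PSUPP octet follows
        pos += 1
        value = 0
        for bit in bits[pos:pos + 8]:     # 8 bits of PSUPP data
            value = (value << 1) | bit
        octets.append(value)
        pos += 8
    pos += 1                              # consume the final PEI = 0
    return octets, pos


def octet_bits(value):
    """Helper for the example: the 8 bits of an octet, MSB first."""
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]


stream = [1] + octet_bits(0x21) + [1] + octet_bits(0x06) + [0] + [1, 1, 0]
psupp, consumed = read_psupp(stream)
```

Each PSUPP octet therefore costs 9 bits in the picture header, and a header
without supplemental information costs a single PEI bit.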



The PSUPP data consists of a 4-bit function type indication FTYPE, followed
by a 4-bit parameter data size specification DSIZE followed by DSIZE octets
of function parameter data, optionally followed by another FTYPE and so on.
It is known to use this PSUPP codeword to signal various situations such as:
to indicate a full-picture or partial-picture freeze or freeze-release request with
or without resizing; to tag particular pictures or sequences of pictures within
the video stream for external use; or to convey chroma key information for
video compositing.

To implement the invention using the Supplemental Enhancement
Information, a further FTYPE is defined as Spare Reference Picture Number.

Figure 8 illustrates the example where a parameter SRPN is included in the
SEI of the picture header. The FTYPE is defined as SRPN. The DSIZE
specifies the size of the parameter and the following octet is the parameter
data i.e. the value of SRPN. From this value a receiving decoder can
determine whether a reference picture has been lost.
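The FTYPE/DSIZE framing of the PSUPP data can be parsed as sketched below.
This is an assumption-laden illustration: the text does not give the numeric
FTYPE value for the SRPN function, so FTYPE_SRPN here is a purely hypothetical
placeholder, as are the names parse_sei and srpn_values. FTYPE is taken to
occupy the upper four bits of the first octet because it is sent first.

```python
FTYPE_SRPN = 0xD   # hypothetical value; the text does not specify one


def parse_sei(octets):
    """Split PSUPP octets into (ftype, parameter_bytes) pairs."""
    functions = []
    pos = 0
    while pos < len(octets):
        ftype = octets[pos] >> 4          # 4-bit function type indication
        dsize = octets[pos] & 0x0F        # 4-bit parameter data size
        params = bytes(octets[pos + 1:pos + 1 + dsize])
        functions.append((ftype, params))
        pos += 1 + dsize                  # skip header octet and parameters
    return functions


def srpn_values(octets):
    """Return the spare reference picture numbers carried in the SEI."""
    values = []
    for ftype, params in parse_sei(octets):
        if ftype == FTYPE_SRPN:
            values.extend(params)         # one SRPN per parameter octet
    return values


sei = [0xD1, 0x06, 0x20]                  # SRPN = 6, then an unrelated function
found = srpn_values(sei)
```

A decoder that does not understand a given FTYPE can still skip it, because
DSIZE tells it how many parameter octets to discard.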



Alternatively, the information may be contained in the additional Supplemental
Enhancement Information as specified in a "Draft of new Annex W: Additional
Supplementary Enhancement Information Specification" P. Ning and S.
Wenger, ITU-T Study Group 16 Question 15 Document Q15-I-58, November



In this draft proposal, FTYPE 14 is defined as "Picture Message". When this
FTYPE is set, the picture message function indicates the presence of one or
more octets representing message data. The first octet of the message data
is a message header with the structure shown in Figure 9 i.e. CONT, EBIT
and MTYPE. DSIZE is equal to the number of octets in the message data
corresponding to a picture message function, including the first octet message
header.
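Unpacking the message header octet can be sketched as follows. The field
widths are an assumption, since Figure 9 is not reproduced in the text: CONT
is taken as 1 bit, EBIT as 3 bits and MTYPE as 4 bits, packed most significant
bit first. The name parse_message_header is hypothetical.

```python
def parse_message_header(octet):
    """Split a picture message header octet into (CONT, EBIT, MTYPE).

    Assumed layout, most significant bit first:
    CONT (1 bit), EBIT (3 bits), MTYPE (4 bits).
    """
    cont = (octet >> 7) & 0x1     # continuation flag
    ebit = (octet >> 4) & 0x7     # bits to ignore in the last message octet
    mtype = octet & 0xF           # message type
    return cont, ebit, mtype


header = parse_message_header(0x09)
```

Under this layout a header octet of 0x09 decodes to CONT 0, EBIT 0 and
MTYPE 9.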



The continuation field CONT, if equal to 1, indicates that the message data
associated with the picture message is part of the same logical message as
the message data associated with the next picture message function. The
End Bit Position field EBIT specifies the number of least significant bits that
shall be ignored in the last message octet. Further details of these fields can
be found in Annex W.

The field MTYPE indicates the type of message. Various types of message
are suggested in the draft of Annex W. According to the invention one type
e.g. MTYPE 9 is defined as SRPN. The value of SRPN is defined in the octet
following the message header.

In a specific example, this message contains one data byte, i.e., DSIZE is 2,
CONT is 0, and EBIT is 0. The message data bytes contain the Spare
Reference Picture Number(s) of the spare reference pictures in preference
order (the most preferred appearing first).

The above description has made reference to encoded video streams in
which bi-directionally predicted pictures (B-pictures) are encoded. As
mentioned earlier, B-pictures are never used as reference pictures. Since
they can be discarded without impacting the picture quality of future pictures,
they provide temporal scalability. Scalability allows for the decoding of a
compressed video sequence at more than one quality level. In other words, a
scalable multimedia clip can be compressed so that it can be streamed over
channels with different data rates and still be decoded and played back in
real-time.

Thus the video stream may be decoded in different ways by differing
decoders. For instance, a decoder can decide only to decode the I- and
P-pictures of a signal, if this is the maximum rate of decoding that the decoder
can attain. However if a decoder has the capacity, it can also decode the
B-pictures and hence increase the picture display rate. Thus the perceived
picture quality of the displayed picture will be enhanced over a decoder that
only decodes the I- and P-pictures.
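Temporal scalability through B-pictures amounts to a filter on the coded
sequence. The sketch below is illustrative: pictures are represented by labels
such as "B1" whose first letter is the picture type, and the name
select_pictures is hypothetical.

```python
def select_pictures(coded, decode_b):
    """Return the pictures a decoder will decode.

    B-pictures are never used as references, so a decoder with limited
    capacity can drop all of them without affecting any other picture.
    """
    return [p for p in coded if decode_b or not p.startswith("B")]


coded = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
low_rate = select_pictures(coded, decode_b=False)
full_rate = select_pictures(coded, decode_b=True)
```

In this example the low-rate decoder displays one picture in three, while the
full-rate decoder displays every picture.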



Scalable multimedia is typically ordered so that there are hierarchical layers of
data. A base layer contains a basic representation of the multimedia clip
whereas enhancement layers contain refinement data on top of underlying
layers. Consequently, the enhancement layers improve the quality of the clip.



Scalability is a desirable property for heterogeneous and error prone
environments. This property is desirable in order to counter limitations such as
constraints on bit rate, display resolution, network throughput, and decoder
complexity.



Scalability can be used to improve error resilience in a transport system where
layered coding is combined with transport prioritisation. The term transport
prioritisation here refers to various mechanisms to provide different qualities
of service in transport, including unequal error protection, to provide different
channels having different error/loss rates. Depending on their nature, data are
assigned differently. For example, the base layer may be delivered through a
channel with a high degree of error protection, and the enhancement layers
may be transmitted through more error-prone channels.

Generally, scalable multimedia coding suffers from a worse compression
efficiency than non-scalable coding. In other words, a multimedia clip encoded
as a scalable multimedia clip with enhancement layers requires greater
bandwidth than if it had been coded as a non-scalable single-layer clip with
equal quality. However, exceptions to this general rule exist, for example the
temporally scalable B-frames in video compression.






The invention may be applied to other scalable video compression systems.
For instance, in H.263 Annex O, two other forms of scalability are defined:
signal-to-noise (SNR) scalability and spatial scalability.



Spatial scalability and SNR scalability are closely related, the only difference
being the increased spatial resolution provided by spatial scalability. An
example of SNR scalable pictures is shown in Figure 10. SNR scalability
implies the creation of multi-rate bit streams. It allows for the recovery of
coding errors, or differences between an original picture and its
reconstruction. This is achieved by using a finer quantiser to encode the
difference picture in an enhancement layer. This additional information
increases the SNR of the overall reproduced picture.
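The idea of SNR scalability can be shown numerically. The sketch below is a
simplification under stated assumptions: quantisation is applied directly to
sample values rather than to DCT coefficients, a picture is a flat list of
integers, and the names quantise and snr_layers and the step sizes are
hypothetical.

```python
def quantise(values, step):
    """Quantise and reconstruct: snap each value to a multiple of step."""
    return [round(v / step) * step for v in values]


def snr_layers(picture, coarse_step, fine_step):
    """Return (base, enhancement) reconstructions for a two-layer coder.

    The base layer uses a coarse quantiser; the enhancement layer codes
    the difference picture (original minus base) with a finer quantiser.
    """
    base = quantise(picture, coarse_step)
    difference = [p - b for p, b in zip(picture, base)]
    enhancement = quantise(difference, fine_step)
    return base, enhancement


picture = [13, 27, 41]
base, enhancement = snr_layers(picture, coarse_step=8, fine_step=3)
# A decoder that receives both layers adds them together.
enhanced = [b + e for b, e in zip(base, enhancement)]
```

In this example the base layer alone reconstructs 16, 24, 40, while base plus
enhancement reconstructs 13, 27, 40, which is closer to the original picture.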



picture are explicitly defined 
sampling process from the 



SNR scaled picture. 
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Spatial scalability allows for t ie creation of multi-resolution bit streams to 
meet varying display requirements and/or constraints. A spatially scalable 

e 11. It is essentially the same as in SNR 
scalability except that a spatisl enhancement layer attempts to recover the 
coding loss between an up-sampled version of the reconstructed reference 
layer picture and a higher resolution version of the original picture. For 
example, if the reference layer has a quarter common intermediate format 
(QCIF) resolution, and the enhancement layer has a common intermediate 
format (CIF) resolution, the reference layer picture must be scaled accordingly
such that the enhancement layer picture can be predicted from it. The QCIF
standard allows the resolution to be increased by a factor of two in the vertical
direction only, horizontal direction only, or both the vertical and horizontal
directions for a single enhancement layer. There can be multiple
enhancement layers, each increasing the picture resolution over that of the
previous layer. The interpolation filters used to up-sample the reference layer
picture are defined in the H.263 standard. Aside from the up-sampling process
from the reference to the enhancement layer, the processing and syntax of a
spatially scaled picture are identical to those of an SNR scaled picture.

In either SNR or spatial scalability, the enhancement layer pictures are
referred to as EI- or EP-pictures. If the enhancement layer picture is upwardly
predicted from a picture in the reference layer, then the enhancement layer
picture is referred to as an Enhancement-I (EI) picture. In this type of
scalability, the reference layer means the layer "below" the current
enhancement layer. In some cases, when reference layer pictures are poorly
predicted, over-coding of static parts of the picture can occur in the
enhancement layer, causing an unnecessarily excessive bit rate. To avoid this
problem, forward prediction is permitted in the enhancement layer. A picture
that can be predicted in the forward direction from a previous enhancement
layer picture or, alternatively, upwardly predicted from the reference layer
picture is referred to as an Enhancement-P (EP) picture. Note that computing
the average of the upwardly and forwardly predicted pictures can provide
bi-directional prediction for EP-pictures. For both EI- and EP-pictures, upward
prediction from the reference layer picture implies that no motion vectors are
required. In the case of forward prediction for EP-pictures, motion vectors are
required.

The SRPN field can be associated with P, PB, Improved PB, and
Enhancement Layer (EP) pictures. However, if Annex N or Annex U is in use and
if the picture is associated with multiple reference pictures, the SRPN is not
used. For PB and Improved PB pictures, the message typically concerns only the
P-part. For EP pictures, the message is used for forward prediction, whereas
upward prediction is done from the temporally corresponding reference layer
picture. This message preferably is not used if the picture is an I, EI or B
picture.
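
The rules above on when an SRPN accompanies a picture can be summarised as a
small predicate. The following is only an illustrative sketch: the names
PictureType and srpn_applicable, and the two boolean parameters, are
hypothetical and do not come from H.263 or from this application.

```python
from enum import Enum


class PictureType(Enum):
    I = "I"
    P = "P"
    PB = "PB"
    IMPROVED_PB = "Improved PB"
    B = "B"
    EI = "EI"
    EP = "EP"


# Picture types with which the SRPN field can be associated, per the text.
SRPN_PICTURE_TYPES = {
    PictureType.P,
    PictureType.PB,
    PictureType.IMPROVED_PB,
    PictureType.EP,
}


def srpn_applicable(picture_type, annex_n_or_u_in_use=False,
                    multiple_reference_pictures=False):
    """Return True if an SRPN may accompany a picture of this type.

    The SRPN is not used when Annex N or Annex U is in use and the picture
    is associated with multiple reference pictures, nor for I, EI or B
    pictures.
    """
    if annex_n_or_u_in_use and multiple_reference_pictures:
        return False
    return picture_type in SRPN_PICTURE_TYPES
```

For PB and Improved PB pictures a True result would apply to the P-part only,
and for EP pictures to the forward prediction only.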



If the encoder is capable of multi-layer coding (for example as discussed in
Annex O of H.263) each layer has consecutive Spare Reference Picture
Numbers. This may be associated with the enhancement layer number
(ELNUM) of the current picture. The Spare Reference Picture Number is
incremented by one from the corresponding number of the previous coded
reference picture in the same enhancement layer. However, if adjacent
pictures in the same enhancement layer have the same temporal reference,
and if Annex N or Annex U of H.263 is in use, the decoder preferably regards
this occurrence as an indication that redundant copies have been sent of
approximately the same pictured scene content, and all of these pictures then
share the same SRPN.

The indicator may also indicate the SRPN for a specified rectangular area of
the current picture if at least part of the area is not correctly received. There
may be multiple messages for one picture each specifying the SRPN for a
non-overlapping rectangular area. If the messages do not cover some areas
of the picture, a decoder uses any error concealment for those areas.
Preferably the decoder uses a concealment method that corresponds to the
picture type i.e. for an INTRA picture an INTRA error concealment method is
used and for an INTER picture an INTER error concealment method is used.

A specific example will now be given. For each error concealment type
message, DSIZE shall be 6, CONT shall be 0, and EBIT shall be 0. If the first
data byte is equal to one (0000 0001), this indicates to a decoder that a SRPN
is included. If the first data byte is equal to two (0000 0010), this indicates to
a decoder that a SRPN is included. The following four PSUPP octets contain
the horizontal and vertical location of the upper left corner of the specified
rectangle, and the width and height of the rectangle, respectively, using eight
bits each and expressed in units of 16 pixels (of luminance picture). For
example, an entire QCIF picture is specified by the four parameters
(0, 0, 11, 9).
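
The conversion from a pixel rectangle to the four eight-bit parameters can be
sketched as follows. This is an illustration only: the function names are
hypothetical, and the choice to round the upper left corner down and the far
edge up to the next multiple of 16 is an assumption made so that the specified
area always covers the requested pixels.

```python
UNIT = 16  # luminance pixels per unit, as stated in the text


def rect_to_units(x, y, width, height):
    """Convert a pixel rectangle to (x, y, width, height) in 16-pixel units.

    Assumption: the corner is rounded down and the far edge rounded up, so
    the unit-aligned rectangle covers the whole pixel rectangle.
    """
    left = x // UNIT
    top = y // UNIT
    right = (x + width + UNIT - 1) // UNIT    # ceiling division
    bottom = (y + height + UNIT - 1) // UNIT  # ceiling division
    params = (left, top, right - left, bottom - top)
    if any(not 0 <= p <= 255 for p in params):
        raise ValueError("each parameter must fit in eight bits")
    return params


def units_to_octets(params):
    """Pack the four parameters into four octets, eight bits each."""
    return bytes(params)


print(rect_to_units(0, 0, 176, 144))  # entire QCIF picture: (0, 0, 11, 9)
print(rect_to_units(0, 0, 160, 120))  # height not divisible by 16: (0, 0, 10, 8)
```

How these four octets sit relative to the other data bytes of the message is
not modelled here.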



For picture formats having a width or height that is not divisible by 16, the
specified area may extend to the next larger size that would be divisible by 16.
For example, an entire image having size of 160 x 120 pixels is specified by
the four parameters (0, 0, 10, 8). Preferably, the specified area does not
cross picture boundaries nor overlap with other specified error concealment
areas of the same picture.

The invention may be implemented in other video coding protocols. For
example MPEG-4 defines so-called user data, which can contain any binary
data and is not necessarily associated with a picture. The additional field may
be added to these fields.



The invention is not intended to be limited to the video coding protocols
discussed above: these are intended to be merely exemplary. The invention
is applicable to any video coding protocol using temporal prediction. The
addition of the information as discussed above allows a receiving decoder to
determine the best course of action if a picture is lost.
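
One way a decoder might act on this information is sketched below. It is a
minimal illustration under stated assumptions, not the decoder of the
invention: the function name, the representation of pictures by their numbers,
and the ordering of the spare references with the best candidate first are all
hypothetical.

```python
def select_reference(default_ref, spare_refs, decoded_pictures):
    """Pick the picture number to predict from, or None if none is usable.

    default_ref      -- number of the default reference picture
    spare_refs       -- spare reference picture numbers signalled for the
                        current picture, best candidate first (may be empty)
    decoded_pictures -- set of picture numbers correctly decoded and stored
    """
    # Normal case: the default reference picture was received.
    if default_ref in decoded_pictures:
        return default_ref
    # Default reference lost: try each signalled spare reference in order.
    for srpn in spare_refs:
        if srpn in decoded_pictures:
            return srpn
    # Nothing usable: the caller falls back to error concealment.
    return None
```

For example, with default reference 5 lost and spare references [3, 2]
signalled, the sketch returns 3 if picture 3 is available and 2 otherwise.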

CLAIMS



1. A method of encoding a video signal representing a sequence of
pictures, the method comprising receiving a current picture for encoding,
forming a temporal prediction of the current picture from a default reference
picture for the current picture, comparing the default reference picture with at
least one further reference picture, calculating a measure of the similarity
between the default reference picture and each further reference picture and,
if the measure of similarity meets a pre-determined criterion, outputting an
indicator identifying the further reference picture and associating the indicator
with the temporal prediction of the current frame.



2. A method according to claim 1 further comprising forming a temporal
prediction of the current picture from a first default reference picture and a
second default reference picture for the current picture, said first default
reference picture occurring temporally before the current picture and said
second default reference picture occurring temporally after the current picture,
comparing the first default reference picture with at least one further reference
picture occurring temporally before the current picture, calculating a measure
of the similarity between the first default reference picture and each further
reference picture and, if the measure of similarity meets a pre-determined
criterion, outputting an indicator identifying the further reference picture and
associating the indicator with the temporal prediction of the current frame.

3. A method according to claim 1 or 2 further comprising comparing the
default reference picture with a plurality of further reference pictures and
outputting an indicator for each further reference picture that meets the
predetermined criterion.

4. A method according to claim 3 further comprising ranking the further
reference pictures that meet the predetermined criterion and associating the
indicator with the temporal prediction of the current frame in order of rank, the
further reference picture having the closest similarity to the default reference
picture being placed first.



5. A method according to any preceding claim wherein the indicator is
included in a picture header.

6. A method according to any preceding claim wherein the video signal is
encoded according to the H.263 video compression standard and the
indicator is included in the Supplemental Enhancement Information.

7. A method of decoding an encoded video signal representing a
sequence of pictures, the encoded signal including pictures that have been
encoded by forming a temporal prediction of a current picture from a default
reference picture for the current picture, the method comprising receiving an
encoded video signal representing a current picture, decoding at least the
picture header of the current picture, and applying error correction and error
concealment methods as appropriate, wherein if the reference picture of the
current picture is lost, the decoder examines an indicator identifying a further
reference picture and decodes the current picture with reference to said
further reference picture if such an indicator is associated with the current
picture.



9. A video encoder comprising an input for receiving a video signal
representing a sequence of pictures, an input for receiving a current picture
for encoding, means for forming a temporal prediction of the current picture
from a default reference picture for the current picture, means for comparing
the default reference picture with at least one further reference picture, means
for calculating a measure of the similarity between the default reference
picture and each further reference picture and, when the measure of similarity
meets a pre-determined criterion, outputting an indicator identifying the further
reference picture and associating the indicator with the temporal prediction of
the current frame.

10. A video decoder comprising an input for receiving an encoded video
signal representing a sequence of pictures, the encoded signal including
pictures that have been encoded by forming a temporal prediction of a current
picture from a default reference picture for the current picture, the decoder
comprising an input for receiving an encoded video signal representing a
current picture, means for decoding at least the picture header of the current
picture, and applying error correction and error concealment methods as
appropriate, wherein when the reference picture of the current picture is lost,
the decoder is arranged to examine an indicator identifying a further reference
picture and to decode the current picture with reference to said further
reference picture if such an indicator is associated with the current picture.



11. A radio telecommunications device including an encoder according to
claim 9 and/or a decoder according to claim 10.

ABSTRACT

VIDEO CODING

A method of encoding a video signal representing a sequence of pictures, the
method comprising receiving a current picture for encoding, forming a
temporal prediction of the current picture from a default reference picture for
the current picture, comparing the default reference picture with at least one
further reference picture, calculating a measure of the similarity between the
default reference picture and each further reference picture and, if the
measure of similarity meets a pre-determined criterion, outputting an indicator
identifying the further reference picture and associating the indicator with the
temporal prediction of the current frame.